Boosting nearest shrunken centroid classifier for microarray data
Abstract
The nearest shrunken centroid classifier (NSC) is a linear classifier with built-in feature selection that has proven useful for analyzing microarray data. The simple linear structure of its classification boundary makes NSC easy to interpret and implement, but that same simplicity can cause it to generalize poorly on some data sets. In this paper we propose boosting NSC to improve its performance; the approach is based on a novel penalized weighted linear regression model. Through application to public microarray data we illustrate the favorable performance of the proposed boosted NSC. Supplementary information can be found at http://www.biostat.umn.edu/~baolin/research/BstNSC.html.
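As background, the basic (unboosted) NSC idea mentioned in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the boosted method proposed in the paper, whose penalized weighted regression model is not described here. The names `soft_threshold`, `fit_nsc`, `predict_nsc` and the threshold parameter `delta` are hypothetical, equal class priors are assumed, and at least two classes are required.

```python
import math
from statistics import median


def soft_threshold(value, delta):
    """Move value toward zero by delta; anything within delta of zero becomes 0."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


def fit_nsc(X, y, delta):
    """Fit a simplified nearest shrunken centroid model (needs >= 2 classes).

    X is a list of samples (lists of floats), y the class labels, and delta
    the shrinkage threshold (a tuning parameter, assumed to be given).
    """
    n, p = len(X), len(X[0])
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)

    overall = [sum(row[j] for row in X) / n for j in range(p)]
    centroids = {k: [sum(r[j] for r in rows) / len(rows) for j in range(p)]
                 for k, rows in groups.items()}

    # Pooled within-class standard deviation of each feature.
    pooled = []
    for j in range(p):
        ss = sum((r[j] - centroids[k][j]) ** 2
                 for k, rows in groups.items() for r in rows)
        pooled.append(math.sqrt(ss / (n - len(groups))))
    s0 = median(pooled)  # offset that guards against very small variances
    scale = [s + s0 for s in pooled]

    shrunken = {}
    for k, rows in groups.items():
        # Scaling for the difference between a class centroid and the overall one.
        m_k = math.sqrt(1.0 / len(rows) - 1.0 / n)
        shrunken[k] = [
            overall[j] + m_k * scale[j] * soft_threshold(
                (centroids[k][j] - overall[j]) / (m_k * scale[j]), delta)
            for j in range(p)
        ]
    return {"centroids": shrunken, "scale": scale}


def predict_nsc(model, x):
    """Return the class whose shrunken centroid is nearest (equal priors assumed)."""
    def distance(k):
        return sum(((xj - cj) / sj) ** 2
                   for xj, cj, sj in zip(x, model["centroids"][k], model["scale"]))
    return min(model["centroids"], key=distance)
```

A feature whose standardized centroid differences are all thresholded to zero has the same shrunken centroid in every class, so it contributes equally to every class distance and drops out of the decision. That is the sense in which the feature selection is built in.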
Similar resources
Nearest Shrunken Centroid as Feature Selection of Microarray Data
The nearest shrunken centroid classifier uses shrunken centroids as prototypes for each class and test samples are classified to belong to the class whose shrunken centroid is nearest to it. In our study, the nearest shrunken centroid classifier was used simply to select important genes prior to classification. Random Forest, a decision tree based classification algorithm, is chosen as a classi...
Improved nearest centroid classifier with shrunken distance measure for null LDA method on cancer classification problem
Null linear discriminant analysis (LDA) is a well-known dimensionality reduction technique for the small sample size problem. When the null LDA technique projects the samples to a lower dimensional space, the covariance matrices of individual classes become zero, i.e. all the projected vectors of a given class merge into a single vector. In this case, only the nearest centroid classifier (NCC) ...
Flexible prediction analysis of microarrays
In this paper, we study the widely used nearest shrunken centroid classifier (NSC, also known as PAM) for microarray data from the supervised dimension reduction perspective. A simple modification is proposed and through application to public microarray data, we illustrate the favorable performance of the proposed method. Supplementary information can be found at http://www.biostat.umn.edu/~ba...
Improved centroids estimation for the nearest shrunken centroid classifier
Motivation: The nearest shrunken centroid (NSC) method has been successfully applied in many DNA-microarray classification problems. The NSC uses 'shrunken' centroids as prototypes for each class and identifies subsets of genes that best characterize each class. Classification is then made to the nearest (shrunken) centroid. The NSC is very easy to implement and very easy to interpret, however, ...
Classification of Anti-learnable Biological and Synthetic Data
We demonstrate a binary classification problem in which standard supervised learning algorithms such as linear and kernel SVM, naive Bayes, ridge regression, k-nearest neighbors, shrunken centroid, multilayer perceptron and decision trees perform in an unusual way. On certain data sets they classify a randomly sampled training subset nearly perfectly, but systematically perform worse than rando...